1. INTRODUCTION

The volume of mobile traffic has greatly increased due to the proliferation of “data-hungry” devices.
Every year since the early 2000s has witnessed substantial growth in the volume of traffic, which
existing network infrastructure is struggling to manage. Cellular network operators use many options to keep up
with mobile traffic demand, but these interventions have not clearly alleviated the problem of mobile traffic
congestion, especially in crowded places and areas with intermittent spikes in traffic demand [1]. Expenditure
on cellular network upgrades is prohibitive, and many mobile network operators are looking at innovative ways
to increase revenue and lower operating and capital expenditure without compromising quality of service (QoS)
requirements. This has been a challenging problem for mobile operators, as implemented solutions have not
produced satisfactory results, especially in areas with large mobile data traffic. An emerging paradigm for
handling mobile traffic is subscriber classification [2]. The network traffic generated by mobile devices has
varying characteristics that can be exploited for classification.

In recent times, it has been found that a large percentage of cellular network traffic emanates
from indoor environments [1], which is also where traffic congestion is most often experienced. Thus, serious
consideration is being given to offloading mobile data traffic through alternative means that are inherently
suited to the indoor environment. Subscriber classification is the process of categorizing cellular network
traffic automatically [3], which enables policies to be tailored to the peculiarities of the classes of
subscribers identified [4]. Subscribers may be classified based on the volume of traffic generated, the
protocol used, port numbers, deep packet inspection through signature analysis, payload sensitivity and other
metrics. Mobile network operators use these to classify subscribers and to ensure that each class gets the QoS
outlined in its service-level agreement.

Cellular network subscriber classification has been used in fraud detection, cybersecurity and various
measurement applications. Due to the problem of mobile traffic congestion and the attendant degradation in
QoS, many mobile network operators are seriously considering the option of Wi-Fi offloading [5]. Wi-Fi
offloading, as it is known in mobile network operators’ parlance, is the automatic switching from the cellular
network to Wi-Fi without any perceptible interruption to the subscriber’s connectivity [6]. The Wi-Fi network
is either leased or maintained by the mobile network operator, and it is able to automatically register mobile
devices due to special features present in the mobile devices and the Wi-Fi access point [7]. Wi-Fi offloading
is particularly attractive to mobile network operators because it is inexpensive, reliable and efficient. When
done automatically, Wi-Fi offloading can substantially improve cellular network data availability.

A scalable solution for identifying influential subscribers from a telecommunication network’s bank
of subscribers was proposed in [8], and the importance of classifying subscribers was noted in [9], which used
a fuzzy clustering algorithm. In the work of Magnusson et al. [8], machine learning was used for subscriber
classification with weighted social network analysis (SNA) metrics. The technique made it possible to
aggregate several metrics and classify millions of subscribers, and the results showed that the proposed
solution was scalable and accurate. The researchers in [10] also classified network subscribers using social
network analysis, based on the complex relationships between all the subscribers in the network. Machine
learning tools were also applied. In [11], the amount of energy produced by telecommunication equipment
operators and consumed by subscribers formed the basis of the classification.

Traffic classification was noted by the authors in [12] as the science of subscriber service 
differentiation that helps with network design. The authors submitted that traffic/subscriber classification 
problems keep evolving due to the prolific and varied ways subscribers use the available spectrum. The 
classification of access network types was proposed by [13] to enable the setup of networks with improved 
protocol and application performance. Intrinsic characteristics of the network, the entropy of packet-pair
inter-arrival times and the median were used to classify the various networks used by subscribers. Ethernet,
wireless local area network and low-bandwidth connections were the classes of network created by the study.
The researchers in [14] suggested classifying subscribers by their attributes rather than by their addresses
and locations.

Data as assets was recognised by [15] as a viable means of increasing the return on revenue for 
connection service providers. A category engine that could generate a category vector for all users was 
proposed as the means of classifying subscribers based on internet preferences. The goal of the classifier was 
to track the subscribers to enable the design of personalized policies tailored to each class of subscribers. The 
use of load balancing, queuing theory, flow size statistics and a threshold policy for smart offloading of data 
was suggested by [16]. The smart mobile data offloading system assigns user traffic to either the Wi-Fi or 
cellular network based on the data rate supported by each mode of traffic offloading. Macro base stations can
handle large data traffic at hours of peak traffic by offloading traffic to wireless access points
[17], [18]. A virtual congestion-optimal Wi-Fi offloading scheme based on a sub-gradient algorithm was used to
optimally set the offloading ratio needed in a cellular-Wi-Fi network to ensure maximum throughput [19].

Traffic offloading from vehicle peers in a vehicle-to-vehicle (V2V) network to roadside Wi-Fi was
considered by [20]. Software-defined radio was employed by [21] to help decide whether to route
network traffic through installed Wi-Fi access points or Wi-Fi based D2D links. In [22], Wi-Fi offloading was
assisted by caching, which lowers waiting time during offloading. Wi-Fi offloading through licensed-assisted
LTE access with connection admission control was used by [23] to minimize issues with
bandwidth availability and QoS. Multi-rate and single-rate apps for Wi-Fi offloading are options that are
considered when cellular traffic needs to be decongested [24]. Delayed packet offloading via Wi-Fi was studied by
[25]. The delayed offloading model was evaluated with different performance metrics to determine how
effectively it helped with network traffic decongestion and the choice of optimal network offloading options.

2. RESEARCH METHOD
2.1. Subscriber classification 

The classification enables the switching mechanism to assign each user to one of four usage classes (very
high, high, medium and low) at every instant of the day, based on the previous characteristics of the user’s
data usage. Using the MATLAB neural network tool, a neural network was created by
defining the network type, the input data, the desired output, the training function, the number of hidden layers,
the number of neurons in each hidden layer and the neuron transfer function. The network type used was
feed-forward backpropagation. The inputs form a matrix of 8760 columns and 437 rows, which
correspond to user events. Figure 1 and Figure 2 show the neural network training tool during and after
training, validation and testing.

Figure 1. Neural network training
Figure 2. MATLAB neural fitting training network
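To make the network structure described above concrete, the following is a minimal Python sketch of a single forward pass through a feed-forward network with one sigmoid hidden layer and four outputs, one per usage class. It is an illustration only and not the MATLAB model used in this work: the layer sizes, the random initial weights, the linear output layer and all function names are assumptions, and no training is performed.

```python
import math
import random

USAGE_CLASSES = ["low", "medium", "high", "very high"]


def sigmoid(z):
    """Logistic transfer function used by the hidden neurons."""
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_inputs, n_outputs, rng):
    """Return (weights, biases) filled with small random initial values."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(n_outputs)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_outputs)]
    return weights, biases


def forward(x, hidden, output):
    """One forward pass: sigmoid hidden layer, then a linear output layer."""
    hidden_w, hidden_b = hidden
    output_w, output_b = output
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(hidden_w, hidden_b)]
    return [sum(w * hi for w, hi in zip(row, h)) + b
            for row, b in zip(output_w, output_b)]


def classify(x, hidden, output):
    """Map the largest output score to one of the four usage classes."""
    scores = forward(x, hidden, output)
    return USAGE_CLASSES[scores.index(max(scores))]


rng = random.Random(0)
n_features, n_hidden = 24, 32  # hypothetical sizes chosen for illustration
hidden_layer = init_layer(n_features, n_hidden, rng)
output_layer = init_layer(n_hidden, len(USAGE_CLASSES), rng)
sample = [rng.random() for _ in range(n_features)]  # made-up usage features
label = classify(sample, hidden_layer, output_layer)
```

With untrained random weights the returned label is arbitrary; the sketch only shows how an input vector is mapped to one of the four classes.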


Values in the columns are determined by the bandwidth and mean of all the current connections at
that point in time. In the target matrix, the values are obtained from preset calculation parameters that
classify each user into a given level and assign the switching based on the system architecture design. The training
functions typically used are gradient descent (traingd), scaled conjugate gradient (trainscg) and
Levenberg-Marquardt (trainlm), depending on the experiment. Given the volume of data to be analysed, the
“trainlm” function was used and the number of hidden neurons was adjusted until a satisfactory value was achieved. A small
segment of the data captured from the remote database Firefox and translated to a MATLAB data array is shown
in [5] and serves as the input data for the ANN training model. Each column represents a unique user, each
row represents a time interval and each cell represents the internet traffic data consumed within the time interval
and the classification of users based on their consumption weight. The data captured from the cloud database is
queried to form the ANN model input and target data, which were divided randomly into a training set, a
validation set and a test set. The training set is a 70% partition of the entire data set, and the other two sets
are 15% partitions each. The performance was measured using the mean square error (MSE)
for the validation set. The MSE was calculated using (1). The neural network regression for each segment of
the data and the error histogram are shown in Figure 3 and Figure 4 respectively.


$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(n_i - d_i\right)^2 \tag{1}$$


where N is the size of the data set, and n_i and d_i denote the network output and the desired output for
sample i. Once the best-performing network, that is, the one with the lowest MSE for the validation set, was
found, the training phase was complete. During training, the training set is used to adjust the weights and
biases of the ANN model, the validation set is used to monitor generalization and select the best network, and
the test set is used to measure the network performance. Each experiment is a neural network model run with a different combination
of number of hidden neurons and learning algorithm. The search is for a neural network that gives a low MSE
for the validation set using a low number of epochs. The basic parameters provided to create a neural network
are the maximum number of epochs (1000), time (infinite) and minimum gradient value (10^-7). The sigmoid
function is the chosen transfer function, as it is commonly used in pattern classification with the
backpropagation method. Five experiments were run using nntool, with the number of hidden neurons ranging
from 32 to 300. The Levenberg-Marquardt algorithm was used as the learning rule, as it gave higher accuracy
than other algorithms such as scaled conjugate gradient. Table 1 shows the summary of the experiments; the best
accuracy was 94.18%.
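The data partitioning and the validation criterion described above can be sketched in Python as follows. This is a hedged illustration rather than the MATLAB procedure itself: it assumes the 8760 columns of the input matrix are the samples being partitioned, and the function names, the fixed random seed and the tie-breaking rule on epochs are assumptions introduced here.

```python
import random


def split_indices(n_samples, train=0.70, val=0.15, seed=0):
    """Randomly partition sample indices into training, validation and test sets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train = round(train * n_samples)
    n_val = round(val * n_samples)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def mse(outputs, targets):
    """Mean square error as in (1): average squared difference over N samples."""
    if len(outputs) != len(targets) or not outputs:
        raise ValueError("outputs and targets must be non-empty and equal in length")
    return sum((n - d) ** 2 for n, d in zip(outputs, targets)) / len(outputs)


def best_run(runs):
    """Pick the run with the lowest validation MSE; fewer epochs break ties."""
    return min(runs, key=lambda r: (r["val_mse"], r["epochs"]))


# 70% / 15% / 15% split of 8760 samples (assumed to be the matrix columns).
train_idx, val_idx, test_idx = split_indices(8760)
```

Each experiment would report its validation MSE and epoch count as a dictionary, and `best_run` then returns the configuration that the search described above would keep.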
Figure 3. Neural network regression


Table 1. Classification accuracy against number of hidden neurons
No. of hidden neurons    No. of epochs    Correct classification (%)
32                       12               12.30
64                       40               22.10
128                      233              40.86
256                      32               83.10
300                      83               94.18


2.2. Bandwidth computation 

The bandwidth utilized is the sum of the uplink and downlink traffic, which is buffered for each user
session; this makes it possible to obtain the total amount of traffic data passing through a network device at a
given time. A user i connected to the internet on a cellular network would have made a download of $d_i$ and
an upload of $u_i$ in a time interval $t_i$. If the user logs on for $x_i$ sessions in a day, then the total
bandwidth $b_i$ (in bits) used in a day and the throughput $\beta_i$ in bits per second (bps) are given
respectively in (2) and (3).

$$b_i = \sum_{x=1}^{x_i} \left(d_i + u_i\right) \tag{2}$$

$$\beta_i = \frac{b_i}{t_i} \tag{3}$$
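As an illustration of (2) and (3), the following Python sketch computes a single user's daily bandwidth and throughput from a list of sessions. The `Session` type, its field names and the example figures are hypothetical. Throughput is taken here as total bits divided by total connected time, which is one possible reading of (3) and is stated as an assumption.

```python
from dataclasses import dataclass


@dataclass
class Session:
    download_bits: float  # downlink traffic of the session, d in (2)
    upload_bits: float    # uplink traffic of the session, u in (2)
    duration_s: float     # session length t in seconds


def daily_bandwidth(sessions):
    """Total bits moved in a day: uplink plus downlink summed over all sessions."""
    return sum(s.download_bits + s.upload_bits for s in sessions)


def daily_throughput(sessions):
    """Throughput in bps, assumed here to be total bits over total connected time."""
    total_time = sum(s.duration_s for s in sessions)
    if total_time <= 0:
        raise ValueError("total session time must be positive")
    return daily_bandwidth(sessions) / total_time


# Hypothetical user with two sessions in one day.
sessions = [Session(8e6, 2e6, 100.0), Session(4e6, 1e6, 50.0)]
bits_used = daily_bandwidth(sessions)
rate_bps = daily_throughput(sessions)
```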


The bandwidth consumed by the jth user during logon session i is $b_{ij}$, where x represents the number of the
user’s logon sessions. If there are n users on the network, then the daily data consumption on the network by
all users is $B_j$ and the corresponding network throughput is $\beta_j$, as given in (4) and (5) respectively. If y
represents the number of days in a month and k denotes the kth day of the month, then the bandwidth consumed over
days $y_1, y_2, y_3, \ldots, y_k$ of the month is $B_k$, so bandwidth consumption on a monthly basis can be expressed
generally as in (6), which is a three-dimensional matrix recursively computed; the corresponding network throughput
is given in (7), while $B_j$ and $b_{ij}$ are two-dimensional arrays. These are network performance evaluation
parameters that can be logged for each month of the year and used for various statistical analyses of the network.

$$B_j = \sum_{j=1}^{n} \sum_{i=1}^{x} b_{ij} \tag{4}$$

$$\beta_j = \sum_{j=1}^{n} \sum_{i=1}^{x} \frac{b_{ij}}{t_{ij}} \tag{5}$$

$$B_k = \sum_{k=1}^{y} \sum_{j=1}^{n} \sum_{i=1}^{x} b_{ijk} \tag{6}$$

$$\beta_k = \sum_{k=1}^{y} \sum_{j=1}^{n} \sum_{i=1}^{x} \frac{b_{ijk}}{t_{ijk}} \tag{7}$$
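The daily and monthly aggregates in (4) to (7) can be read as nested sums over days, users and sessions. The sketch below follows that reading in Python, with throughput taken as the sum of per-session rates. The nested-list layout, in which a month is a list of days, a day is a list of users and a user is a list of (bits, seconds) session pairs, is an assumption made for illustration, as are the function names and sample values.

```python
def daily_consumption(day):
    """Sum of session bits over all users and sessions on one day, as in (4)."""
    return sum(bits for user in day for bits, _ in user)


def daily_throughput(day):
    """Sum of per-session rates (bits / seconds) on one day, as in (5)."""
    return sum(bits / seconds for user in day for bits, seconds in user)


def monthly_consumption(month):
    """Sum of the daily consumption over every day of the month, as in (6)."""
    return sum(daily_consumption(day) for day in month)


def monthly_throughput(month):
    """Sum of the daily throughput over every day of the month, as in (7)."""
    return sum(daily_throughput(day) for day in month)


# Hypothetical two-day log for two users; each session is (bits, seconds).
day_1 = [[(1000, 10), (500, 5)], [(2000, 20)]]
day_2 = [[(3000, 30)], []]  # the second user did not log on
month = [day_1, day_2]
```

Keeping the per-day values in a list, rather than only the monthly total, matches the statement above that these parameters can be logged and analysed month by month.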

2.3. Quality of service defined criteria

Bandwidth utilization (U) is the amount of bandwidth $B_c$ used by consumers in relation to the total
bandwidth $B_s$ available from the network service provider; a utilization of 75% is considered good in this
research. QoS delivery is a function of the bandwidth utilization factor given in (8), while the bandwidth supplied
by the provider is represented in (9), where $B_m$, $B_w$ and $B_s$ are respectively the bandwidth of the cellular
network, the bandwidth of the Wi-Fi network and the total bandwidth from the service providers. The users’ quality
of experience (QoE) is defined as in (10), where $R_d$ is the throughput experienced by the user and $R_s$ is the
system throughput. The throughput demanded by a user is the average amount of data the user regularly consumes at
the specific time of the day or over the month.

$$U = \frac{B_c}{B_s} \times 100, \quad \text{where } B_c < B_s \text{ always} \tag{8}$$

$$B_s = \begin{cases} B_m + B_w, & \text{if Wi-Fi is present} \\ B_m, & \text{if Wi-Fi is not present} \end{cases} \tag{9}$$

$$\mathrm{QoE} = \frac{R_d}{R_s} \times 100 \tag{10}$$
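A short Python sketch of the criteria in (8) to (10) is given below. The function names and the sample figures are hypothetical and are not taken from the measurements in this work. The sketch accepts the boundary case of consumed bandwidth equal to supplied bandwidth and only rejects values above it, which is an assumption.

```python
def supplied_bandwidth(b_m, b_w=None):
    """Cellular bandwidth, plus Wi-Fi bandwidth when Wi-Fi is present, as in (9)."""
    return b_m if b_w is None else b_m + b_w


def utilization(b_c, b_s):
    """Consumed bandwidth as a percentage of supplied bandwidth, as in (8)."""
    if b_s <= 0:
        raise ValueError("supplied bandwidth must be positive")
    if b_c > b_s:
        raise ValueError("consumed bandwidth cannot exceed supplied bandwidth")
    return 100.0 * b_c / b_s


def qoe(r_d, r_s):
    """User throughput as a percentage of system throughput, as in (10)."""
    if r_s <= 0:
        raise ValueError("system throughput must be positive")
    return 100.0 * r_d / r_s


# Hypothetical figures: the same consumed bandwidth with and without Wi-Fi.
u_cellular_only = utilization(72.0, supplied_bandwidth(80.0))
u_with_wifi = utilization(72.0, supplied_bandwidth(80.0, 20.0))
```

In this made-up case the added Wi-Fi bandwidth lowers utilization from above the 75% mark to below it, which is the effect the offloading scheme relies on.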


2.4. Wi-Fi offloading algorithm 

The process involves analysing every user’s traffic database to obtain the bandwidth consumption,
bandwidth utilization and throughput, since the switching process requires each user’s bandwidth
consumption at every point in time. A MATLAB artificial neural network was also used to predict the
consumption level of the user and determine the congestion status of the cellular network. If the average QoS of
users is very low, the system recognizes the need to decongest the cellular network, and Wi-Fi offload
is performed. The algorithm of the process is presented below. It shows the process of switching between
cellular and Wi-Fi networks based on the parameters explained above. Much of the switching is automated,
thanks to the learning algorithm, which determines switchable behaviour based on patterns that are unique to
certain times and locations.
Start:
Step 1: Get the total bandwidth $B_s$ supplied by the network, the available Wi-Fi bandwidth $B_w$ and the
        throughput supplied $R_s$
Step 2: Compute the bandwidth and throughput demanded, respectively
Step 3: Do while the number of active users N > 0
Step 4: Compute the bandwidth consumed: $B_c = B_d + B_w$
Step 5: Compute the utilization: $U = \frac{B_c}{B_s} \times 100$, with $B_c < B_s$ always
Step 6: Apply the QoS and QoE evaluation criteria:
        Default cellular utilization: 0.75 (75%)
        If U > 0.75: switch to Wi-Fi
        If U < 0.75: remain active on the cellular network
        $\mathrm{QoE} = \frac{R_d}{R_s} \times 100$
Step 7: Go to Step 2
Stop

In addition, the bandwidth of the available cellular and Wi-Fi networks is determined and
aggregated by the available service providers. The algorithm determines the number of active users in the current
session; there must be active users, otherwise the program shuts down. It then loops through the active users for
further analysis, examining the bandwidth utilization of each user for the given day of the month and month of the
year. QoS is determined and buffered, and the users to be offloaded to the Wi-Fi network (switched) or to remain on
the cellular network are also determined.
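The switching rule in the algorithm can be sketched in Python as a simple greedy pass over the active users. This is one possible interpretation and not the implemented system: the function name, the order in which users are processed, the treatment of utilization exactly equal to the threshold (the user stays on cellular) and the omission of any Wi-Fi capacity check are all assumptions made for illustration.

```python
THRESHOLD = 75.0  # default cellular utilization from Step 6, in percent


def assign_users(demands, b_m, threshold=THRESHOLD):
    """Assign each active user to the cellular or the Wi-Fi network.

    demands maps a user id to its demanded bandwidth, in the same unit as the
    cellular bandwidth b_m. A user stays on cellular while the resulting
    cellular utilization does not exceed the threshold; otherwise the user is
    switched to Wi-Fi. Wi-Fi capacity is not checked in this sketch.
    """
    if b_m <= 0:
        raise ValueError("cellular bandwidth must be positive")
    cellular_load = 0.0
    assignment = {}
    for user, demand in demands.items():
        projected = 100.0 * (cellular_load + demand) / b_m
        if projected > threshold:
            assignment[user] = "wifi"
        else:
            assignment[user] = "cellular"
            cellular_load += demand
    return assignment, 100.0 * cellular_load / b_m


# Hypothetical demands for four active users on a 100-unit cellular link.
demands = {"a": 30.0, "b": 30.0, "c": 20.0, "d": 10.0}
assignment, final_utilization = assign_users(demands, b_m=100.0)
```

In this made-up example, admitting user "c" would push cellular utilization to 80%, so that user is offloaded to Wi-Fi while the others remain on the cellular network.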


3. RESULTS AND DISCUSSION 

Over a hundred-hour period, the peak number of users for the cellular-network-only configuration was
seventy-six (76), as shown in Figure 4. The lowest number of users recorded was sixty-three (63) and the average
number of cellular network users was seventy (70). A total of one hundred users was observed over the
hundred-hour period. The number of users in the network at any given time was below this maximum
because of irregularity in connections and the fact that some users may not be constantly connected. The
high blocking probability resulting from congestion, together with anti-congestion algorithms designed to
prevent users from connecting to the cellular network once it gets congested, may also be responsible.

This argument is buttressed by the bandwidth utilization plot in Figure 5. The average bandwidth 
utilization without any form of offloading is 95% and reaches 100% at several points in a hundred-hour period. 
This shows that the bandwidth utilization is at a critical point and could lead to a lot of call drops and a higher 
blocking probability for the users of the cellular network. The utilization plot also shows the need to implement
some form of relief for the congested and fully utilized cellular network.

With the implementation of an offloading technique, in this case Wi-Fi, the bandwidth utilization
drops to between 74% and 76% even at peak traffic periods. In Figure 6, the total bandwidth utilization for the
cellular network, inclusive of the traffic handled by the adaptive Wi-Fi connection, falls to about 85% even
at the peak network traffic period. The adaptive Wi-Fi offloading scheme implemented on the network is
capable of decongesting the network and admitting all users that wish to connect to the network. The scheme
takes some of the burden of traffic handling from the cellular network and ensures
that the network does not get congested (that is, does not reach 100% bandwidth utilization). As seen from
Figure 7, the blocking probability drops drastically, as all one hundred (100) users are admitted into the
network due to the decongestion achieved by the proposed adaptive Wi-Fi offloading scheme. Thus, at periods of
peak traffic, all users that wish to connect to the network are admitted without any
form of blocking or dropping.


Figure 6. Mean bandwidth utilization after decongestion (panel title: Quality of Service; axes: percentage of bandwidth utilization against time in hours)
Figure 7. Mobile network with Wi-Fi offloading (axes: number of users against time in hours; series: cellular network, Wi-Fi network)


Bulletin of Electr Eng & Inf, Vol. 11, No. 2, April 2022: 917-925 


Bulletin of Electr Eng & Inf ISSN: 2302-9285


The maximum achievable throughput for the cellular network without any adaptive Wi-Fi offload is
shown by Figure 8 to be about 4.6 Mbps at peak traffic. The cellular network has a limit in terms of
throughput, especially at peak traffic periods: even when users wish to send and receive more
information per second, the congested cellular network places an upper bound on the peak amount of data
that can be transmitted by all users in the network. Thus, the performance of the network without any form of
Wi-Fi offloading is less than satisfactory, considering that at peak traffic periods about 20% of users
are not admitted into the network. With the implementation of the proposed adaptive Wi-Fi offloading
scheme based on subscriber classification, not only are all users admitted at peak traffic periods, but the
throughput also reaches as high as 7 Mbps. This demonstrates the gain, 50.53%, from
implementing the proposed adaptive Wi-Fi offloading scheme on the congestion-prone
network.

Figure 8. Congested mobile network without Wi-Fi offload


4. CONCLUSION 

This study has used the neural network approach to aid automatic switching mechanisms for mobile
network users in a Wi-Fi offloading technology, using bandwidth, QoS and the mean available bandwidth of the
current mobile network and the alternative Wi-Fi network. Adaptive Wi-Fi offload has proved to be a very
effective means of decongesting the cellular network. The experiments showed that bandwidth utilization, QoS
and throughput are adequately optimized. To further improve the percentage accuracy, a larger data set should
be considered. Attributes characterizing different types of devices and locations could be used as
additional features for the neural network. All the experiments in this work use the same partitioning, including
the experiments with re-initialization of the weights and biases. Thus, it may be
worthwhile to rerun all experiments with different partitions of the given data. The supervised learning method
of feed-forward backpropagation has been used here. Further experiments could use other learning methods
such as clustering and self-organizing methods. In addition, a support vector machine could be used instead of
the neural network.